Preservation DataStores: New storage paradigm for preservation environments

Authors

  • Simona Rabinovici-Cohen
  • Michael Factor
  • Dalit Naor
  • Leeat Ramati
  • Petra Reshef
  • Shahar Ronen
  • Julian Satran
  • David L. Giaretta
Abstract

ion levels to existing OSD commands. In PDS, both OSD and its HL API are executed in a separate process from the upper layers. The XAM layer communicates with the HL-OSD interface via RPC.

Table 1. Preservation DataStores (PDS) functionality and benefits for preservation environments. (AIP: archival information package; PDI: preservation description information.)

Preservation-aware storage requirement: Encapsulate and physically colocate the raw data and its metadata objects, such as RepInfo, provenance, and fixity.
PDS support:
  • Awareness of the AIP structure.
  • Manage data availability at the level of an AIP.
  • Group related objects and create copies of objects according to the inherent importance of the data.
  • Aggregate the AIPs into clusters so that each cluster is self-contained and can be placed on the same media unit.
Benefits for preservation environments:
  • Ensures that metadata needed for interpretation is not separated from the raw data and thus reduces the risk of losing the metadata.
  • Supports graceful loss of data, namely the degree of lost information is proportional to the number of bits lost.

Preservation-aware storage requirement: Execute data-intensive functions such as fixity computation within the storage.
PDS support:
  • Provide a storlets container (such as applets in applications and servlets in servers), a container that can embed and execute restricted modules with predefined interfaces.
Benefits for preservation environments:
  • Lessens network bandwidth.
  • Reduces risks of data loss.
  • Utilizes the locality property.

Preservation-aware storage requirement: Execute transformations internally.
PDS support:
  • Provide a storlets container that can embed and execute transformations.
Benefits for preservation environments:
  • Simplifies applications since transformations can be carried out by the storage instead of the application.
  • Improves performance by reducing bandwidth.
  • Enables transformations to be applied during the migration process.

Preservation-aware storage requirement: Include in the stored AIP the RepInfo of its PDI.
PDS support:
  • Awareness of the PDI structure.
Benefits for preservation environments:
  • Enables the interpretation of PDI in the future.

Preservation-aware storage requirement: Handle technical provenance records internally.
PDS support:
  • Awareness of the provenance metadata and appending internally technical provenance events, such as events related to the migration.
Benefits for preservation environments:
  • Simplifies applications on top of PDS by reducing the number of events to handle.
  • Includes richer events that are known only to the storage.

Preservation-aware storage requirement: Support media migration as opposed to system migration (i.e., migration by physically detaching the media from one system and attaching it to the new system).
PDS support:
  • Supporting a standardized self-contained, self-describing data format.
Benefits for preservation environments:
  • Reduces the cost of migration.
  • Reduces the risks of data loss during migrations.

Preservation-aware storage requirement: Maintain referential integrity including updating all the links during the migration process so they remain valid in the new system.
PDS support:
  • Awareness of the context and RepInfo metadata and understanding the fields that represent links to either internal or external locations.
Benefits for preservation environments:
  • Simplifies applications.
  • Increases the robustness of the system.

S. RABINOVICI-COHEN ET AL. IBM J. RES. & DEV. VOL. 52 NO. 4/5 JULY/SEPTEMBER 2008 394

XAM session execution model

A XAM storage system contains one or more XSystems, where each XSystem is a logical container of XSet records. An XSet, the basic artifact in XAM, is a data structure that packages multiple XSet fields (data and metadata), bundled together for access under a common globally unique external name. A XAM client that requires access to a specific XSystem has to establish a XAM session with the PDS XAM library. A XAM session represents a path to the underlying object storage and serves as a context in which XAM requests can be performed. To perform a PDS API call (e.g., ingestAIP), multiple XAM requests are invoked on the same XAM session, for example, creating an XSet and then creating its various fields. Since a PDS API call is expected to be long-lived and to handle large amounts of data, multiple PDS API calls are not handled on the same thread.
Instead, each such PDS API call is assigned a dedicated XAM session executed in a dedicated thread.

Transactions in XAM

In order to support the XSet behavioral model, the concept of an XSet transaction was introduced. XSet transactions support atomicity, consistency, isolation, and durability. An XSet transaction may include one XAM API call (e.g., deleteXSet) or it may include several XAM API calls. For example, an XSet transaction may begin on openXSet, perform several calls, and end on closeXSet. If closeXSet is called without a prior commit call, the state of the XSet will be rolled back to the state before the transaction began. A commit call will make the changes persistent.

PDS interfaces

PDS exposes a set of interfaces that form the PDS entry points, together with their arguments and return values. The PDS entry points cover the functionality PDS exposes to its users, including a variety of ways to ingest and access data and metadata, manipulate previously ingested data and metadata, retrieve PDS system information, and configure policies. The entry points may be called directly or via Web services. The PDS interfaces aim to be abstract and technology independent and to survive implementation replacements. The entry points may return different exceptions, which are also part of the PDS interfaces. Some of the structures used by the PDS entry points, such as the inner structure of the PDI record (e.g., the inner structure of a provenance record), may have different variants and may depend on the source that generated the record. PDS aims to treat a set of records that may differ in their inner structure in a uniform way: although the provenance records may contain records with different inner structures, PDS still handles them similarly. To enable that, the inner structure of each record has its own RepInfo, such as an XML schema, that is maintained along with the content of the record.
When a record is generated by PDS, we use a PDS default XML schema as the RepInfo for the record. These PDS default schemas are also exposed so that users can make
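
The XSet transaction behavior described in this excerpt (changes become persistent on commit, and closing without a prior commit rolls the XSet back) can be illustrated with a minimal in-memory sketch. This is not the XAM API or the PDS implementation: the class names InMemoryXSystem and XSetTransaction, the methods open_xset, set_field, commit, close, and read_fields, and the "aip-001" name are all hypothetical stand-ins chosen for illustration, and the sketch models only the commit and rollback rule, not isolation between concurrent sessions or on-disk durability.

```python
import copy


class InMemoryXSystem:
    """Hypothetical stand-in for an XSystem: maps XSet names to field dicts."""

    def __init__(self):
        self._xsets = {}

    def open_xset(self, name):
        # Opening begins a transaction that works on a private copy,
        # so the committed state is untouched until commit() is called.
        committed = self._xsets.get(name, {})
        return XSetTransaction(self, name, copy.deepcopy(committed))

    def read_fields(self, name):
        # Return a copy of the last committed state only.
        return copy.deepcopy(self._xsets.get(name, {}))


class XSetTransaction:
    """Hypothetical transaction handle for one XSet (not a real XAM type)."""

    def __init__(self, xsystem, name, working_copy):
        self._xsystem = xsystem
        self._name = name
        self._working = working_copy
        self._closed = False

    def set_field(self, field, value):
        self._check_open()
        self._working[field] = value

    def commit(self):
        # Publish the working copy as the new committed state.
        self._check_open()
        self._xsystem._xsets[self._name] = copy.deepcopy(self._working)

    def close(self):
        # Anything not committed is discarded together with the working copy,
        # which is the rollback described in the text.
        self._closed = True
        self._working = None

    def _check_open(self):
        if self._closed:
            raise RuntimeError("XSet transaction is closed")


def demo():
    xsystem = InMemoryXSystem()

    # First transaction: the change is committed before closing.
    tx = xsystem.open_xset("aip-001")
    tx.set_field("fixity", "abc123")
    tx.commit()
    tx.close()

    # Second transaction: closed without commit, so the change is rolled back.
    tx = xsystem.open_xset("aip-001")
    tx.set_field("fixity", "overwritten")
    tx.close()

    return xsystem.read_fields("aip-001")
```

Calling demo() returns the fields as of the first transaction, because the second one was closed without a commit. A real store would also need to serialize or isolate concurrent transactions on the same XSet, which this sketch leaves out.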




Journal title:
  • IBM Journal of Research and Development

Volume 52  Issue 

Pages  -

Publication date 2008